1.MBTI Mirror: Community Engagement Disposition Mapping (EDA)
Through in-depth exploratory data analysis of post and comment data from the MBTI subreddit community, we discovered significant engagement changes and community activity trends. The analysis shows engagement patterns across different MBTI types, revealing that certain types of posts and comments are more frequent than others. This finding provides valuable insight into the distribution and interactions of personality types within communities.
Code
# load the preprocessed dataimport pandas as pdimport plotly.graph_objects as gofrom plotly.subplots import make_subplotsimport kaleidodf = pd.read_csv("../data/csv/time-series.csv")# Create subplotsfig = make_subplots(rows=2, cols=1, subplot_titles=("Monthly Submission Counts", "Average Comments and Scores"))# Line plot fig.add_trace( go.Scatter(x=df['submission_year'].astype(str) +'-'+ df['submission_month'].astype(str), y=df['count(submission_title)'], name='Submission Counts', mode='lines+markers'), row=1, col=1)# Bar plotfig.add_trace( go.Bar(x=df['submission_year'].astype(str) +'-'+ df['submission_month'].astype(str), y=df['avg_num_comments'], name='Average Number of Comments'), row=2, col=1)# Scatter plot fig.add_trace( go.Scatter(x=df['submission_year'].astype(str) +'-'+ df['submission_month'].astype(str), y=df['avg_score'], name='Average Score', mode='markers'), row=2, col=1)# Update xaxisfig.update_xaxes(title_text="Date", row=1, col=1)fig.update_xaxes(title_text="Date", row=2, col=1)# Update yaxis fig.update_yaxes(title_text="Submission Counts", row=1, col=1)fig.update_yaxes(title_text="Average Values", row=2, col=1)# Update title and layoutfig.update_layout( title_text="Reddit MBTI Topic Submission Trends", title_x=0.45, # This centers the title showlegend=True,)fig.update_layout( title=dict( font=dict( size=20, # Adjusting the font size ) ))# Show the figurefig.show()
2.MBTI Context: Exploring the discourse world of introversion and extroversion (NLP)
Using NLP technology to analyze the discussion content in the community, we successfully identified the main discussion topics and key words about introversion and extroversion. This not only reveals common perceptions of these character traits among community members, but also demonstrates differences in emotional tendencies and word choice. These insights provide a deeper understanding of the nuances of language expression across MBTI types.
3.MBTI Decoded: The Power of Predicting Personality with Machine Learning (ML)
By building and applying machine learning models, we successfully predicted MBTI types, demonstrating the model’s remarkable performance in accuracy and performance. Additionally, our analysis explores potential links between MBTI types and health data, revealing specific health risks and patterns that different personality types may face. These insights are extremely important for designing targeted health intervention strategies.
Code
import pandas as pdimport plotly.graph_objs as goimport networkx as nxrules_pd=pd.read_csv("../data/csv/ordered_association_rules.csv")# Create a graphG = nx.DiGraph()# Add nodes and edgesfor _, row in rules_pd.iterrows(): G.add_edge(str(row['antecedent']), str(row['consequent']), weight=row['confidence'])# Generate position layoutpos = nx.spring_layout(G)# Create edge traceedge_x = []edge_y = []for edge in G.edges(): x0, y0 = pos[edge[0]] x1, y1 = pos[edge[1]] edge_x.extend([x0, x1, None]) edge_y.extend([y0, y1, None])edge_trace = go.Scatter( x=edge_x, y=edge_y, line=dict(width=0.5, color='#888'), hoverinfo='none', mode='lines')# Create node tracenode_x = []node_y = []for node in G.nodes(): x, y = pos[node] node_x.append(x) node_y.append(y)#color_list=['#ffffd9', '#f5fbc4', '#eaf7b1', '#d6efb3', '#bde5b5', '#97d6b9', '#73c8bd', '#52bcc2', '#37acc3', '#2498c1', '#1f80b8', '#2165ab', '#234da0', '#253795', '#172978', '#081d58']color_list=['#081d58','#253795','#1f80b8','#97d6b9','#ffffd9']node_trace = go.Scatter( x=node_x, y=node_y, mode='markers+text', # Add 'text' to the mode text=[node for node in G.nodes()], # Add node labels textposition="bottom center", # Position of text hoverinfo='text', marker=dict( showscale=True, colorscale=color_list, reversescale=True, color=[], size=15, # Increase node size colorbar=dict( thickness=15, title='Number of Node Connections', xanchor='left', titleside='right' ), line_width=1.5))# Add node text and hover infonode_adjacencies = []node_text = []for node, adjacencies inenumerate(G.adjacency()): node_adjacencies.append(len(adjacencies[1])) node_text.append(f'{adjacencies[0]}')node_trace.marker.color = node_adjacenciesnode_trace.text = node_text# Create figurefig = go.Figure(data=[edge_trace, node_trace], layout=go.Layout( title='<br>Network graph of association rules', titlefont_size=23, showlegend=False, hovermode='closest', margin=dict(b=20,l=5,r=5,t=80), annotations=[dict( text="Python plotly library", showarrow=False, xref="paper", yref="paper", x=0.005, y=-0.002)], xaxis=dict(showgrid=False, zeroline=False, showticklabels=False), yaxis=dict(showgrid=False, zeroline=False, showticklabels=False)) )fig.show()
Through an in-depth analysis of the MBTI subreddit community, this project successfully achieved insights into community engagement, MBTI type-specific discussions, trends over time, posting behavior within different personality dyads, and correlations with health data. Comprehensive understanding. This project also sheds light on how personality types shape patterns of communication and behavior in a community, providing a unique insight into this complex and diverse community.
Next Steps Plan
To further deepen our understanding of the MBTI subreddit community and improve the comprehensiveness of our analysis, our future plans will include several key directions.
First, we plan to examine comments and posts that may have been deleted or removed to ensure our data set is as complete as possible to provide a more accurate analysis of community engagement.
Second, we will set out to collect more MBTI-related health data to make broader generalizations about the relationship between different personality types and health conditions. This will not only enhance our current health-related analyses, but may also reveal new interesting associations.
Finally, we plan to increase interaction with users and audiences. By participating more actively in the community, we can better understand the needs and interests of our users, thereby further improving our analysis methods and research directions.
What we learned from this project
During the exploration of this project, I found that INFP people like me like to explore topics about MBTI on Reddit. Based on the actual situation of my daily use of the Internet, it seems that this is indeed the case. I pay attention to exploring people’s inner voices. I like to listen to people telling their own stories online, and I am also very empathetic. In the exploration of MBTI and physical health, it is true that due to my personality, I like to stay at home and do not like to exercise. And this also caused my spinal health to be in a bad situation. This analysis reinforced my need to change my habits to improve my health.
In our MBTI project, I’ve come to a somewhat bittersweet realization about my own personality type: we’re the unsung heroes of the MBTI world, rarely the center of online discussions. Our analysis showed that ESTJs, like me, are less inclined to share our lives on the internet, preferring real-life interactions. And a striking personal revelation came from our health data analysis - it turns out ESTJs are prone to posture-related issues. As a desk-bound workaholic, I could relate all too well, having experienced the exact ailments our data pointed out.
As expected, individuals with F/P personality types tend to share their emotions on social media more than those with T/J types. Surprisingly, contrary to common perceptions, I found that I (introverted) personalities posted over three times more than E (extroverted) ones. This could indicate that introverts, often perceived as reserved, might find social media a comfortable platform to express themselves, more so than extroverts. Additionally, there’s a noticeable trend in MBTI discussions on social media, with peaks at the start and end of each year. This suggests that people are more inclined to explore and discuss their MBTI types during periods of self-reflection and while setting New Year’s resolutions.
Through this project, I’ve observed that individuals with an extraverted (E) personality type tend to be active on social media, and their posts are easily discoverable. On the other hand, those with an introverted (I) personality type seem to be more reserved, often avoiding the sharing of their thoughts on social platforms. Consequently, I posit that individuals with an introverted personality type are more likely to exude a sense of mystery. Additionally, I’ve come across an intriguing observation regarding the correlation between MBTI types and individuals’ health situations. This connection might be attributed to their habits and lifestyle choices.
Source Code
---title: "Conclusion" author: - Project Team 10 - Mingqian Liu, Xinyu Li, Xin Xiang, Yanfeng Zhang---# Project summary### 1.MBTI Mirror: Community Engagement Disposition Mapping (EDA)Through in-depth exploratory data analysis of post and comment data from the MBTI subreddit community, we discovered significant engagement changes and community activity trends. The analysis shows engagement patterns across different MBTI types, revealing that certain types of posts and comments are more frequent than others. This finding provides valuable insight into the distribution and interactions of personality types within communities.```{python}# load the preprocessed dataimport pandas as pdimport plotly.graph_objects as gofrom plotly.subplots import make_subplotsimport kaleidodf = pd.read_csv("../data/csv/time-series.csv")# Create subplotsfig = make_subplots(rows=2, cols=1, subplot_titles=("Monthly Submission Counts", "Average Comments and Scores"))# Line plot fig.add_trace( go.Scatter(x=df['submission_year'].astype(str) +'-'+ df['submission_month'].astype(str), y=df['count(submission_title)'], name='Submission Counts', mode='lines+markers'), row=1, col=1)# Bar plotfig.add_trace( go.Bar(x=df['submission_year'].astype(str) +'-'+ df['submission_month'].astype(str), y=df['avg_num_comments'], name='Average Number of Comments'), row=2, col=1)# Scatter plot fig.add_trace( go.Scatter(x=df['submission_year'].astype(str) +'-'+ df['submission_month'].astype(str), y=df['avg_score'], name='Average Score', mode='markers'), row=2, col=1)# Update xaxisfig.update_xaxes(title_text="Date", row=1, col=1)fig.update_xaxes(title_text="Date", row=2, col=1)# Update yaxis fig.update_yaxes(title_text="Submission Counts", row=1, col=1)fig.update_yaxes(title_text="Average Values", row=2, col=1)# Update title and layoutfig.update_layout( title_text="Reddit MBTI Topic Submission Trends", title_x=0.45, # This centers the title showlegend=True,)fig.update_layout( title=dict( font=dict( size=20, # Adjusting the font size ) ))# Show the figurefig.show()```### 2.MBTI Context: Exploring the discourse world of introversion and extroversion (NLP)Using NLP technology to analyze the discussion content in the community, we successfully identified the main discussion topics and key words about introversion and extroversion. This not only reveals common perceptions of these character traits among community members, but also demonstrates differences in emotional tendencies and word choice. These insights provide a deeper understanding of the nuances of language expression across MBTI types.### 3.MBTI Decoded: The Power of Predicting Personality with Machine Learning (ML)By building and applying machine learning models, we successfully predicted MBTI types, demonstrating the model's remarkable performance in accuracy and performance. Additionally, our analysis explores potential links between MBTI types and health data, revealing specific health risks and patterns that different personality types may face. These insights are extremely important for designing targeted health intervention strategies.```{python}import pandas as pdimport plotly.graph_objs as goimport networkx as nxrules_pd=pd.read_csv("../data/csv/ordered_association_rules.csv")# Create a graphG = nx.DiGraph()# Add nodes and edgesfor _, row in rules_pd.iterrows(): G.add_edge(str(row['antecedent']), str(row['consequent']), weight=row['confidence'])# Generate position layoutpos = nx.spring_layout(G)# Create edge traceedge_x = []edge_y = []for edge in G.edges(): x0, y0 = pos[edge[0]] x1, y1 = pos[edge[1]] edge_x.extend([x0, x1, None]) edge_y.extend([y0, y1, None])edge_trace = go.Scatter( x=edge_x, y=edge_y, line=dict(width=0.5, color='#888'), hoverinfo='none', mode='lines')# Create node tracenode_x = []node_y = []for node in G.nodes(): x, y = pos[node] node_x.append(x) node_y.append(y)#color_list=['#ffffd9', '#f5fbc4', '#eaf7b1', '#d6efb3', '#bde5b5', '#97d6b9', '#73c8bd', '#52bcc2', '#37acc3', '#2498c1', '#1f80b8', '#2165ab', '#234da0', '#253795', '#172978', '#081d58']color_list=['#081d58','#253795','#1f80b8','#97d6b9','#ffffd9']node_trace = go.Scatter( x=node_x, y=node_y, mode='markers+text', # Add 'text' to the mode text=[node for node in G.nodes()], # Add node labels textposition="bottom center", # Position of text hoverinfo='text', marker=dict( showscale=True, colorscale=color_list, reversescale=True, color=[], size=15, # Increase node size colorbar=dict( thickness=15, title='Number of Node Connections', xanchor='left', titleside='right' ), line_width=1.5))# Add node text and hover infonode_adjacencies = []node_text = []for node, adjacencies inenumerate(G.adjacency()): node_adjacencies.append(len(adjacencies[1])) node_text.append(f'{adjacencies[0]}')node_trace.marker.color = node_adjacenciesnode_trace.text = node_text# Create figurefig = go.Figure(data=[edge_trace, node_trace], layout=go.Layout( title='<br>Network graph of association rules', titlefont_size=23, showlegend=False, hovermode='closest', margin=dict(b=20,l=5,r=5,t=80), annotations=[dict( text="Python plotly library", showarrow=False, xref="paper", yref="paper", x=0.005, y=-0.002)], xaxis=dict(showgrid=False, zeroline=False, showticklabels=False), yaxis=dict(showgrid=False, zeroline=False, showticklabels=False)) )fig.show()```Through an in-depth analysis of the MBTI subreddit community, this project successfully achieved insights into community engagement, MBTI type-specific discussions, trends over time, posting behavior within different personality dyads, and correlations with health data. Comprehensive understanding. This project also sheds light on how personality types shape patterns of communication and behavior in a community, providing a unique insight into this complex and diverse community.# Next Steps PlanTo further deepen our understanding of the MBTI subreddit community and improve the comprehensiveness of our analysis, our future plans will include several key directions. - First, we plan to examine comments and posts that may have been deleted or removed to ensure our data set is as complete as possible to provide a more accurate analysis of community engagement. - Second, we will set out to collect more MBTI-related health data to make broader generalizations about the relationship between different personality types and health conditions. This will not only enhance our current health-related analyses, but may also reveal new interesting associations. - Finally, we plan to increase interaction with users and audiences. By participating more actively in the community, we can better understand the needs and interests of our users, thereby further improving our analysis methods and research directions.# What we learned from this project{style="float: left;border-radius: 5%;margin:10px 20px;" alt="" width="30%"}<br>During the exploration of this project, I found that INFP people like me like to explore topics about MBTI on Reddit. Based on the actual situation of my daily use of the Internet, it seems that this is indeed the case. I pay attention to exploring people's inner voices. I like to listen to people telling their own stories online, and I am also very empathetic. In the exploration of MBTI and physical health, it is true that due to my personality, I like to stay at home and do not like to exercise. And this also caused my spinal health to be in a bad situation. This analysis reinforced my need to change my habits to improve my health.<br>---------------------{style="float: right;border-radius: 5%;margin:10px 20px;" alt="" width="30%"}<br>In our MBTI project, I've come to a somewhat bittersweet realization about my own personality type: we're the unsung heroes of the MBTI world, rarely the center of online discussions. Our analysis showed that ESTJs, like me, are less inclined to share our lives on the internet, preferring real-life interactions. And a striking personal revelation came from our health data analysis - it turns out ESTJs are prone to posture-related issues. As a desk-bound workaholic, I could relate all too well, having experienced the exact ailments our data pointed out.<br><br>---------------------{style="float: left;border-radius: 5%;margin:10px 20px;" alt="" width="30%"}As expected, individuals with F/P personality types tend to share their emotions on social media more than those with T/J types. Surprisingly, contrary to common perceptions, I found that I (introverted) personalities posted over three times more than E (extroverted) ones. This could indicate that introverts, often perceived as reserved, might find social media a comfortable platform to express themselves, more so than extroverts. Additionally, there's a noticeable trend in MBTI discussions on social media, with peaks at the start and end of each year. This suggests that people are more inclined to explore and discuss their MBTI types during periods of self-reflection and while setting New Year's resolutions.---------------------{style="float: right;border-radius: 5%;margin:10px 20px;" alt="" width="30%"}<br>Through this project, I've observed that individuals with an extraverted (E) personality type tend to be active on social media, and their posts are easily discoverable. On the other hand, those with an introverted (I) personality type seem to be more reserved, often avoiding the sharing of their thoughts on social platforms. Consequently, I posit that individuals with an introverted personality type are more likely to exude a sense of mystery. Additionally, I've come across an intriguing observation regarding the correlation between MBTI types and individuals' health situations. This connection might be attributed to their habits and lifestyle choices.